This repository has been archived by the owner on Jan 19, 2022. It is now read-only.

Kubelet race not setting correct pod env (KUBERNETES_SERVICE_HOST and KUBERNETES_SERVICE_PORT) #135

Open
wants to merge 1 commit into master

Conversation

@vmansolas commented Oct 12, 2021

KUBERNETES_SERVICE_HOST and KUBERNETES_SERVICE_PORT are sometimes not set by the kubelet because of a known race condition: kubernetes/kubernetes#40973.
When this happens while running a QuarkJob, the output-persist container falls back to the default configuration and later fails when running inside a Kubernetes cluster:
#output-persist
2021-09-22T09:48:28.108Z INFO internal/persist_output.go:79 does not exist, using default kube config
2021-09-22T09:48:28.109Z INFO internal/persist_output.go:83 Checking kube config
2021/09/22 09:48:28 Couldn't check Kubeconfig. Ensure kubeconfig is correct to continue.: invalid kube config: Get "http://localhost:8080/version?timeout=32s": dial tcp [::1]:8080: connect: connection refused

This change introduces a workaround as per the proposed solution (see https://github.com/kubernetes/kubernetes/blob/fa0387c9fea7cf4b3e3032bb384d8b5f99580154/pkg/kubelet/kubelet_pods.go#L395-L398):

  • If the KUBERNETES_SERVICE_HOST env variable is not set, a DNS lookup is performed for "kubernetes.default.svc" to find the default Kubernetes API.
  • If the lookup succeeds, KUBERNETES_SERVICE_HOST is set to the resolved address and KUBERNETES_SERVICE_PORT is set to 443.
  • The flow then continues normally, reporting "Using in-cluster kube config". A sketch of the fallback follows below.
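
A minimal sketch of this fallback in Go, using a hypothetical helper name (`ensureKubernetesServiceEnv`) rather than the exact code added to persist_output.go:

```go
package main

import (
	"fmt"
	"net"
	"os"
)

// ensureKubernetesServiceEnv works around kubernetes/kubernetes#40973: if the
// kubelet raced and did not inject the service env vars, resolve the in-cluster
// API service DNS name and assume the default HTTPS port, mirroring the kubelet
// fallback linked above.
func ensureKubernetesServiceEnv() error {
	if os.Getenv("KUBERNETES_SERVICE_HOST") != "" {
		return nil // env vars already injected by the kubelet; nothing to do
	}

	addrs, err := net.LookupHost("kubernetes.default.svc")
	if err != nil || len(addrs) == 0 {
		return fmt.Errorf("lookup of kubernetes.default.svc failed: %v", err)
	}

	// Point client-go's in-cluster config at the resolved address.
	os.Setenv("KUBERNETES_SERVICE_HOST", addrs[0])
	os.Setenv("KUBERNETES_SERVICE_PORT", "443")
	return nil
}

func main() {
	if err := ensureKubernetesServiceEnv(); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println("Using in-cluster kube config")
}
```

The lookup only confirms that cluster DNS can resolve the API service; setting the port to 443 is the same assumption the kubelet fallback makes.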

@linux-foundation-easycla bot commented Oct 12, 2021

CLA Signed

The committers are authorized under a signed CLA.

@vmansolas vmansolas marked this pull request as ready for review October 21, 2021 12:58